Maximum Likelihood Non-linear Transformation for Environment Adaptation in Speech Recognition Systems
Abstract
In this paper, we describe an adaptation method for speech recognition systems that is based on a piecewise-linear approximation to a non-linear transformation of the feature space. The method extends a previously proposed non-linear transformation (NLT) technique in two respects: the transformation function is more sophisticated (piecewise-linear instead of piecewise-constant), and the transformation is computed to maximize the likelihood of the adaptation data given its transcription (instead of just matching the global statistics of the test and training data). The method also differs from linear techniques (such as MLLR and linear feature-space transforms) in two ways: first, the computed transformation is non-linear; second, the tying structure of the transformation depends not on the phonetic class but on the location in the feature space. Experimental results show that the method performs well when adaptation data is limited, and the performance gains appear to be additive to those provided by MLLR, yielding up to 3.4% relative improvement over MLLR.
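To make the contrast between a piecewise-constant and a piecewise-linear feature transformation concrete, here is a minimal Python sketch for a single feature dimension. It is only an illustration under stated assumptions: the knot positions, offsets, and function names are hypothetical, and the abstract does not give the paper's actual parameterization or its maximum-likelihood estimation procedure, so neither is reproduced here.

```python
import numpy as np

def piecewise_constant_shift(x, edges, shifts):
    """Add one constant offset per bin; bins are delimited by `edges`."""
    # np.digitize gives the bin index of each value; clip keeps it in range.
    idx = np.clip(np.digitize(x, edges) - 1, 0, len(shifts) - 1)
    return x + shifts[idx]

def piecewise_linear_map(x, knots_in, knots_out):
    """Map x through a continuous piecewise-linear function given by knots."""
    # np.interp interpolates linearly between knots and clamps outside them.
    return np.interp(x, knots_in, knots_out)

# Hypothetical parameters (assumptions for illustration, not from the paper).
edges = np.array([-2.0, 0.0, 2.0])       # bin boundaries
shifts = np.array([1.0, 0.75])           # one offset per bin
knots_in = np.array([-2.0, 0.0, 2.0])    # knot locations in feature space
knots_out = np.array([-1.0, 0.5, 3.0])   # transformed values at the knots

x = np.array([-1.0, 0.0, 1.0])
y_const = piecewise_constant_shift(x, edges, shifts)
y_lin = piecewise_linear_map(x, knots_in, knots_out)
print(y_const, y_lin)
```

In both functions the segment applied to a feature value is selected by where that value lies in the feature space, not by any phonetic class, which mirrors the tying structure the abstract describes. In the method itself the parameters would be chosen to maximize the likelihood of the adaptation data rather than fixed by hand as they are here.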
Similar resources
Hindi Speech Recognition and Online Speaker Adaptation
Speaker adaptation is a technique used to improve the recognition accuracy of Automatic Speech Recognition (ASR) systems. Here, we report a study of the impact of online speaker adaptation on the performance of a speaker-independent, continuous speech recognition system for the Hindi language. The speaker adaptation is performed using the Maximum Likelihood Linear Regression (MLLR) transfo...
Maximum likelihood stochastic transformation adaptation for medium and small data sets
Speaker adaptation is recognized as an essential part of today’s large-vocabulary automatic speech recognition systems. A family of techniques that has been extensively applied for limited adaptation data is transformation-based adaptation. In transformation-based adaptation we partition our parameter space in a set of classes, estimate a transform (usually linear) for each class and apply the ...
Investigations on linear transformations for speaker adaptation and normalization
This thesis deals with linear transformations at various stages of the automatic speech recognition process. In current state-of-the-art speech recognition systems linear transformations are widely used to care for a potential mismatch of the training and testing data and thus enhance the recognition performance. A large number of approaches has been proposed in literature, though the connectio...
Environment Adaptation and Long Term Parameters in Speaker Identification
In this paper, we have integrated in a GMM based speaker identification system two different techniques: a) Maximum Likelihood Linear Regression (MLLR) transformation which adapts the system to the new environment based on modifying the continuous densities of the GMM mixtures. We apply the MLLR to perform environmental compensation by reducing a mismatch due to channel or additive noise effects, ...
Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition
This paper proposes a Bayesian affine transformation of hidden Markov model (HMM) parameters for reducing the acoustic mismatch problem in telephone speech recognition. Our purpose is to transform the existing HMM parameters into its new version of specific telephone environment using affine function so as to improve the recognition rate. The maximum a posteriori (MAP) estimation which merges t...
Publication date: 2001